Prosody dependent speech recognition with explicit duration modelling at intonational phrase boundaries
Abstract
Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which words and phonemes are dependent on prosody in a way that improves word recognition. The prosodic attribute that we investigate in this study is the lengthening of speech segments in the vicinity of intonational phrase boundaries. An Explicit Duration Hidden Markov Model (EDHMM) is implemented to provide an accurate phoneme duration model. The study is conducted on the Boston University Radio News Corpus, with prosodic boundaries marked using the ToBI labelling system. We find that lengthening of phrase-final rhymes can be reliably modelled by the EDHMM, which significantly improves prosody dependent acoustic modelling. By contrast, no systematic duration variation is found in phrase-initial position. With prosody dependence implemented in the acoustic model, pronunciation model and language model, both word recognition accuracy and boundary recognition accuracy improve by 1% over systems without prosody dependence.
Similar resources
Prosody Dependent Speech Reco Duration Modelling at Intonatio
Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that improves word recognition. The prosody attribute that we investigate in this study is the lengthening of speech segments in the vicinity of intonational phrase boundaries. Explicit Duration Hidden Markov Model (EDHMM) is implemented to pr...
An Intonational Phrase Boundary and Pitch Accent Dependent Speech Recognizer
Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that improves word recognition. We describe the idea of prosody dependent speech recognition by building a prosody dependent speech recognizer that conditions word and phoneme models on two important prosodic variables: intonational phrase bou...
Automatic Prosody Labeling Final Project Report for EE 6820 - Spring 05 Professor : Dan
Automatic transcription of prosody is necessary for spoken language understanding. Prominence and intonational boundaries are routinely used to convey meaning beyond that expressed in the lexical content of speech. Using a classification rule learning algorithm and computationally light acoustic and syntactic features, detection of pitch accent at 87% on spontaneous elicited speech were attained...
Prosodic correlates of directly reported speech: Evidence from conversational speech
This paper investigates the prosodic characteristics of reported speech in the Switchboard corpus. We find that directly reported speech is signalled by a greater overall pitch range than the surrounding narrative material and is typically preceded by intonational phrase boundaries. By contrast, prosody does not seem to distinguish indirectly reported speech from ordinary narrative speech. The ...
Improving the Robustness of Prosody Dependent Language Modeling Based on Prosody Syntax Dependence
This paper presents a novel approach that improves the robustness of prosody dependent language modeling by leveraging the dependence between prosody and syntax. A prosody dependent language model describes the joint probability distribution of concurrent word and prosody sequences and can be used to provide prior language constraints in a prosody dependent speech recognizer. Robust Maximum Lik...